Abstract

After Boltzmann and Gibbs, the notion of disorder in statistical physics relates to 
ensembles, not to individual states. This disorder is measured by the logarithm of 
ensemble volume, the entropy. But recent results about measure concentration ef- 
fects in analysis and geometry allow us to return from the ensemble-based point 
of view to a state-based one, at least, partially. In this paper, the order-disorder 
problem is represented as a problem of relation between distance and measure. The 
effect of strong order-disorder separation for multiparticle systems is described: the 
phase space could be divided into two subsets, one of them (set of disordered states) 
has almost zero diameter, the second one has almost zero measure. The symmetry 
with respect to permutations of particles is responsible for this type of concentra- 
tion. Dynamics of systems with strong order-disorder separation has high average 
acceleration squared, which can be interpreted as evolution through a series of
collisions (acceleration-dominated dynamics). The direction of the time arrow, from
order to disorder, follows from the strong order-disorder separation. Conversely,
for systems in a space of symmetric configurations with "sticky boundaries", the way
back from disorder to order is typical (natural selection). Recommendations for
mining molecular dynamics results are also presented.
Introduction



Is everything clear about entropy growth? It seems that it is not. A collection
of problem statements and approaches was published in Physica A on the eve
of the millennium [1,2,3]. Very recently, V.L. Ginzburg in his Nobel Lecture
characterized this problem as one of the greatest challenges for physicists:

The "great problems" are, first, the increase in entropy, time irreversibility, 
and the "time arrow" [4] . 

We usually describe the time arrow as an increase of disorder, and measure disorder
by the (logarithm of) phase volume, following the famous Boltzmann epitaph

$$S = k \ln W$$

W is the volume of an ensemble, and S is the entropy of this ensemble. The 
ensemble-based point of view was expressed recently in the following reasoning 
([5], p. 329): 

The well known question of what has more order, a fine castle or a pile 
of stones, has a profound answer: It depends on which pile you mean. If 
"piles" are thought as all configurations of stones which are not castles, 
then there are many more such piles, and so there is less order in such a 
pile. However, if these are specially and uniquely placed stones (for example, 
a garden of stones) , then there is the same amount of order in such a pile 
as in a fine castle. Not a specific configuration is important but an assembly 
of configurations embraced by one notion. 

It seems to be true, but it is not the whole truth. In this paper the ensemble- 
based point of view will be complemented by the state-based one: The notions 
of order and disorder can describe not only ensembles, but points also. 

The following toy example helps to clarify the difference between the
state-based and the ensemble-based points of view, and shows how the measure
of order and disorder depends on human activity and perspective as well as on
the state itself. Most people are familiar with the situation described in
this example.

In the book [6], a picture of "order" after intensive play of four children
illustrates the idea that the definition of order depends on the point of view:
the same set of positions and orientations of toys may serve as a representative
of a rather big ensemble of equivalent disorders ("parents-room"), or as an
almost unique configuration that changes its sense after a small change
("children-room"). Children implicitly use the positions and orientations of all
their toys in their play. For parents, these differences are not important. The
same room (a state) produces different ensembles, depending on the perspective.
The notion of "order" distinguishes the wide ensemble of the parents-room (big
volume, disorder) from the narrow ensemble of the children-room (small volume,
order), and the entropy measures this difference. This situation should be
reflected on deeply before entering any discussion about order-disorder
measurement.
This difference between the parents-room and the children-room can be
formalized by the volumes of equivalent configurations. For the parents-room, it
seems to be larger, because the parents-equivalence is coarser (they use "other
variables" for description of the state of the room). Here we meet the important
operation that replaces a state (a point) by an ensemble. The simplest formal
version of this operation is the so-called "fattening": in a metric space with
metric $\rho(x, y)$, for any set $A$ and $\varepsilon > 0$ the
$\varepsilon$-fattening of $A$ is the set

$$A_\varepsilon = \{x : \rho(x, y) < \varepsilon \text{ for some } y \in A\}. \tag{1}$$

The set $A_\varepsilon$ includes all points that belong to $A$ "with accuracy
$\varepsilon$." ^1 Our first attempt to describe the difference between the
parents-room and the children-room is the hypothesis that these ensembles are
results of $\varepsilon$-fattening of the same state (a point), but with
significantly different $\varepsilon$. The volume of the parents-room-ensemble
is much larger than the volume of the children-room-ensemble.
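
As an illustration of definition (1), here is a minimal Python sketch of a membership test for the fattening of a finite set of points. The Euclidean metric, the function name `in_fattening` and the sample points are assumptions made for this example only; they are not taken from the paper.

```python
import math
from typing import Iterable, Sequence


def in_fattening(x: Sequence[float], A: Iterable[Sequence[float]], eps: float) -> bool:
    """Return True if x lies in the eps-fattening of the finite set A.

    Assumption: the metric rho is the Euclidean distance.
    """
    return any(math.dist(x, y) < eps for y in A)


if __name__ == "__main__":
    A = [(0.0, 0.0), (1.0, 1.0)]  # hypothetical two-point set
    print(in_fattening((0.05, 0.0), A, 0.1))  # point close to (0, 0)
    print(in_fattening((0.5, 0.5), A, 0.1))   # point far from both elements of A
```

A larger `eps` accepts more points, which is the sense in which a coarser description replaces one state by a bigger ensemble.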

This point of view is not the final one. Later in this paper, it will be
complemented by the permutation analysis: the parents-room has more permutation
symmetry than the children-room, and this causes a significant difference
between their $\varepsilon$-fattenings even for the same $\varepsilon$.
Symmetrization turns out to be the most important operation for understanding
the difference between thermodynamic order and disorder.

In this paper, the order-disorder problem is represented as a problem of the
relation between distance and measure. The main focus of our consideration is the
effect of order-disorder separation: for systems with a large number of
particles, the available phase space (or configuration space) can be divided into
two parts. One part has microscopically small diameter (part D, disorder), the
other part (part O, order) has microscopically small measure (volume). We call a
quantity microscopically small if it tends to 0 when the number of particles
tends to $\infty$. Of course, a proper normalization of the volume and distance
is assumed. As a consequence of the order-disorder separation, there exists
a microscopically small $\varepsilon > 0$ such that for each point $x$ from the
part D its $\varepsilon$-fattening $\{x\}_\varepsilon$ includes almost all the
volume (the rest of the volume is microscopically small).

We follow the idea of thin-thick decomposition (see M. Gromov's book [7], p.
124). The effect of order-disorder separation is one of the measure concentration
effects. The geometry of spaces with finite but very large dimension has
some interesting features that simplify the asymptotic picture in comparison
with both the small-dimensional and the infinite-dimensional pictures. The
typical questions refer to various asymptotic relations between the Lebesgue
measure and the Euclidean distance. Recently, effects of this kind have
been studied very intensively [8,7,9]. Some links between concentration of
measure and the works of Boltzmann, Maxwell, Gibbs, and Ehrenfest are presented
below (nothing is absolutely new).

^1 The fattening is similar to the Ehrenfests' coarse-graining [13,14].

For the measure concentration that leads to order-disorder separation, the
permutation symmetry between particles (PI, Permutation Invariance) is
important.

The paper has the following structure. In the next section, two classical
examples of measure concentration are presented: the waist (or Maxwell)
concentration of the volume of multidimensional spheres near equators, and the
boundary (or Gibbs) concentration of the volume of multidimensional balls
near their boundaries (spheres). In Sec. 2, Feynman's analysis of an example of
order increase is revisited [10]. The order-disorder separation for Feynman's
example is demonstrated in Sec. 3.

Below, we discuss the statistical idea of order/disorder only. It is based on the
analysis of differences between less probable and more probable events for large
systems. There exist many other notions of order/disorder; the most important of
them is the presence/absence of a regular structure. We do not touch on them in
this paper.



1 The classical measure concentration effects 

For large dimension $n$, the main part of the volume of the unit $n$-dimensional
ball $B^n$ is concentrated in a small neighborhood of its boundary, that is,
the unit sphere $S^{n-1}$. This simple but very seminal fact can be
demonstrated as follows. Let us use the normalized volume $|\cdot|$: $|B^n| = 1$.
The corresponding (normalized) surface area of the unit sphere is a constant $C_n$,
$|B^n| = \int_0^1 C_n r^{n-1}\,dr$, hence $C_n = n$. The volume of the part of
$B^n$ inside the $\varepsilon$-neighborhood of $S^{n-1}$ is

$$V_\varepsilon = 1 - \int_0^{1-\varepsilon} n r^{n-1}\,dr = 1 - (1 - \varepsilon)^n. \tag{2}$$

For small $\varepsilon$ and large $n$ (say, $n > 1/\varepsilon$) we obtain the
exponential estimate:

$$V_\varepsilon = 1 - (1 - \varepsilon)^n \approx 1 - \exp(-n\varepsilon). \tag{3}$$

It implies that for given $\varepsilon$ and $n \to \infty$ the volume
$V_\varepsilon \to 1$ as $1 - \exp(-n\varepsilon)$ (exponentially).
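
The boundary concentration in (2) and (3) can be checked numerically. The following is a minimal Python sketch; the function names are illustrative and are not part of the paper.

```python
import math


def shell_volume(n: int, eps: float) -> float:
    """Normalized volume of the eps-neighborhood of the unit sphere
    inside the unit n-ball, Eq. (2): 1 - (1 - eps)^n."""
    return 1.0 - (1.0 - eps) ** n


def shell_volume_exp(n: int, eps: float) -> float:
    """Exponential approximation from Eq. (3): 1 - exp(-n * eps)."""
    return 1.0 - math.exp(-n * eps)


if __name__ == "__main__":
    eps = 0.01
    for n in (3, 100, 1000, 10000):
        # For n much larger than 1/eps almost all the volume is in the thin shell.
        print(n, shell_volume(n, eps), shell_volume_exp(n, eps))
```

For a fixed thin shell the exact value and the exponential estimate agree closely once $n$ exceeds $1/\varepsilon$, while in low dimension the shell holds only a small share of the volume.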
The sphere $S^{n-1}$ can be considered as the isoenergetic surface for a very simple
energy function, $E = \sum_{i=1}^{n} x_i^2$ (up to a constant factor): that is, for
the kinetic energy of $n$ classical particles on a line, or for the potential energy
of $n$ simplest classical oscillators. Of course, the observed concentration theorem
can be proved for more general energy functions. Usually these generalizations are
formulated as theorems of ensemble equivalence: for large $n$ the canonical ensemble
(the ensemble with the probability distribution that maximizes the entropy functional
for a given average energy value) is equivalent to the microcanonical ensemble (that
is, equidistribution on the isoenergetic surface with respect to the invariant
Liouville measure). A.I. Khinchin (1943) [11] describes the probabilistic theory of
ensemble equivalence when the energy is a "sum function", which means that the system
consists of a large number of noninteracting subsystems. We call this type of
concentration the Gibbs concentration. The analysis of ensemble equivalence and
nonequivalence is presented in Ref. [19] with relevant references.

The same type of reasoning can be applied to a hemisphere $H^{n-1} = \{x \in
S^{n-1} : x_1 > 0\}$: for large $n$ almost all the measure of the hemisphere
$H^{n-1}$ is concentrated near its boundary $S^{n-2} = \{x \in S^{n-1} : x_1 = 0\}$.
Hence, almost all the measure of $S^{n-1}$ is concentrated near its
$(n-2)$-dimensional equator. The exponential estimate of the type (3) is also
valid. The well-known application of this "waist concentration" is the Maxwell
distribution for particle velocity: if the $n$-particle system in the velocity
space is equidistributed on the sphere of radius $R$, $R^2 = \sum_{i=1}^{n} v_i^2
= 3nkT/m$, then, for large $n$, the velocity of one particle is distributed
according to the Maxwell distribution. The distribution of each Cartesian velocity
component $v$ has the Gaussian density
$\frac{1}{\sqrt{2\pi}\sigma}\exp(-v^2/2\sigma^2)$, where $\sigma^2 = kT/m$. In
other terms, the projection of the uniform distribution from the unit sphere $S^n$
onto the first axis has, for large $n$, a narrow (almost) Gaussian distribution
$\sim \exp(-x_1^2/2\sigma^2)$, where $\sigma = 1/\sqrt{n}$.

Due to ensemble equivalence, this Maxwell concentration can be demonstrated
as concentration of the projection of the equidistribution in the ball $B^n$
on a line. This projection is a probability distribution on the segment $[-1, 1]$
with the density $\sim (\sqrt{1 - x^2})^n$. For large $n$,
$(\sqrt{1 - x^2})^n \approx \exp(-nx^2/2)$, and the projection density approaches
the Gaussian distribution $\sqrt{\frac{n}{2\pi}}\exp(-nx^2/2)$ with the standard
deviation $\sigma = 1/\sqrt{n}$.
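
The $1/\sqrt{n}$ width of the waist can be observed in a small simulation. The sketch below is an illustration under stated assumptions, not the authors' procedure: it draws uniform points on the unit sphere in $\mathbb{R}^n$ by normalizing Gaussian vectors (a standard technique) and estimates the variance of the first coordinate, which should be close to $1/n$. All names are hypothetical.

```python
import math
import random


def sphere_first_coordinate(n: int, rng: random.Random) -> float:
    """First coordinate of a point drawn uniformly from the unit sphere in R^n.

    A normalized vector of independent standard Gaussians is uniform on the sphere.
    """
    g = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(v * v for v in g))
    return g[0] / norm


def first_coordinate_variance(n: int, samples: int, seed: int = 0) -> float:
    """Monte Carlo estimate of the variance of the first coordinate."""
    rng = random.Random(seed)
    xs = [sphere_first_coordinate(n, rng) for _ in range(samples)]
    mean = sum(xs) / samples
    return sum((x - mean) ** 2 for x in xs) / samples


if __name__ == "__main__":
    for n in (10, 50, 200):
        # n * variance should stay close to 1, i.e. sigma is close to 1/sqrt(n).
        print(n, n * first_coordinate_variance(n, 2000))
```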

The waist concentration holds not only for the projection on a coordinate axis,
but for any (nonlinear) Lipschitz function $F(x)$ with Lipschitz constant 1
($|F(x) - F(y)| \leq |x - y|$): for large $n$, the values of such a function on
$S^n$ are concentrated in a $\frac{1}{\sqrt{n}}$-small interval around the median
value $M_F$ defined by the following statement:

$$P(F(x) \geq M_F) \geq \frac{1}{2} \quad \text{and} \quad P(F(x) \leq M_F) \geq \frac{1}{2}.$$

This is the Lévy theorem [12].
The Maxwell distribution was known before statistical mechanics was developed
by Gibbs (and almost at the same time and independently by Einstein).
The waist concentration, in this sense, was discovered by Maxwell.

Let us mention one important property of the waist concentration: the points
on the sphere are distributed uniformly, and are equivalent in any reasonable
sense: the measure is concentrated near every equator. In one-dimensional
projections (both linear and general Lipschitz) this symmetry is destroyed, and
there are distinguished points: the median and its $\frac{1}{\sqrt{n}}$-small
neighborhood. The complement of this set has small measure, and this set of
distinguished points has small diameter; and, of course, a small vicinity of any
distinguished point has the same property: it has "almost full" measure and
small diameter. It does not matter whether this projection is linear or not; only
the Lipschitz property is important. It makes no difference if the projection is
not one-dimensional: for any given dimension and for the number of degrees of
freedom $n \to \infty$ the result is the same. The final results concerning the
waist concentration for different possible relations between $n$ and the
dimension of the projection were obtained by M. Gromov [9].

We can call the distinguished points "thermalized" states, or
"near-equilibrium" states, but initially, on the multidimensional sphere, all the
states are equivalent, and the distinguished points of measure concentration
emerge only in a macroscopic projection. In the following section we present the
order-disorder separation for microscopic states.



2 Strong order-disorder separation for symmetric microscopic states 



All of classical statistical physics is the theory of symmetric ensembles: the
density $\rho(x_1, x_2, \ldots, x_n)$ of the full multi-particle probability
distribution is symmetric with respect to permutations of particles (here $x_i$
is the phase point for the $i$-th particle). In this section, we demonstrate the
concentration effect that emerges in the projection of the phase space (or
configuration space) of $n$ particles onto the space of permutation orbits. The
$n$-particle space is $P^n$, where $P$ is the one-particle space. The space of
orbits can be presented as the space of $n$-point subsets in the one-particle
space $P$ (in the measure and distance discussion for continuous spaces we can
neglect the degenerate case when positions of some particles coincide).
2.1 Feynman's blue and white atoms mixing

Let us start from the simplest example, the mixing of blue and white atoms analyzed
in the book "The Character of Physical Law" by R. Feynman [10].

You have atoms of two different kinds (it's ridiculous, but let's call them 
blue and white) jiggling all the time in thermal motion. If we were to start 
from the beginning we should have mostly atoms of one kind on one side, 
and atoms of other kind on the other side. Now these atoms are jiggling 
around, billions and billions of them, and if we start them with one kind 
all on one side, and the other kind on the other side, we see that in their 
perpetual irregular motions they will get mixed up, and that is why the 
water becomes more or less uniformly blue. ... 

If you start with a thing that is separated and make irregular changes, 
it does get more uniform. But if it starts uniform and you make irregular 
changes, it does not get separated. It could get separated. It is not against 
the law of physics that the molecules bounce around so that they separate. 
It is just unlikely. It would never happen in a million years. And that is the 
answer. 

This discussion is interesting not only for what is clearly explained, but also for
what is carefully hidden. Let $P$ be the box where the atoms move. The
configuration space is $P^n$, where $n$ is the number of particles. The separated
configurations ("with one kind all on one side, and the other kind on the other
side") form an ensemble (a "drop") with volume $2^n$ times smaller than the
whole volume of $P^n$. The concentration effects in the velocity spaces allow
us to represent the corresponding ensemble in phase space also as a drop with
constant density inside it (for example, with equidistribution in a velocity
ball). This is convenient for discussion. The volume of this drop is $2^n$ times
smaller than the equilibrium volume (hence, the density is $2^n$ times larger).
This volume is conserved in the mechanical motion. Hence, after some time
this ensemble becomes more mixed, but remains "oil in water", that is, a phase
space drop with the same volume and density. In the sense of ensembles it is
not a "uniform" ensemble, and if somebody (the Maxwell demon, for example)
carefully inverted all the velocities, this ensemble would return to the initial
separated state.

What does Feynman mean by "starting from homogeneous state we never will
get the separation..."? The "uniform states" form an absolutely new ensemble; it
is not a result of the evolution of the initial ensemble. Starting from the
initial separated state we do reach some of the "uniform states", but not all
such states. The phase volume is different. For "all uniform states" it is $2^n$
times larger, where $n$ is the number of particles. How can we get all the uniform
states (ensemble U) from the states we can reach from our ordered states
(ensemble O)?
And here Feynman uses an unexpected new notion, irregular changes: "If you
start with a thing that is separated and make irregular changes, it does get 
more uniform." And back: "if it starts uniform and you make irregular changes, 
it does not get separated." 

Who makes these irregular changes, how, and what does it mean? A
small portion of irregular changes makes the mixed "oil in water" ensemble
strictly uniform. Where did this concept come from? We can find a source of
this idea in coarse-graining.

The idea of coarse-graining dates back to P. and T. Ehrenfest, and it was
most clearly expressed in their famous paper of 1911 [13]. The Ehrenfests
considered a partition of the phase space into small cells, and suggested
supplementing the motion of the phase space ensemble due to the Liouville
equation with "shaking": averaging of the density of the ensemble over the
phase cells. As a result of this process, the convergence to equilibrium
becomes uniform instead of convergence on average. It is the fattening that we
mentioned in the Introduction. This "fattening-based" approach was developed
into a general technique of nonequilibrium thermodynamics [14,15]. What is
the physical nature of the $\varepsilon$-fattening? The first interpretation is
noise: any kind of small noise or small perturbations, where $\varepsilon$ is the
amplitude of this noise. Another interpretation of $\varepsilon$ is the achievable
accuracy of measurement and control.

But there is a purely mechanical effect: we start from a state with a small
volume of its $\varepsilon$-fattening, and after some time of motion the system
typically reaches states with a large volume of their $\varepsilon$-fattening.

After some time of mechanical motion a typical state (a point, not an ensemble)
becomes "thick": permutation symmetrization with microscopically small
fattening transforms this point into a uniform ensemble. The explanation of
this effect is based on the study of the geometry of a multidimensional simplex
that we perform in the next subsection.

Let us watch the blue particles only, in a one-dimensional box $P = [0, 1]$ (in
the direction of separation $x$). In order to represent the set^2 of $n$ particles
as a point in a standard simplex, we introduce symmetric coordinates for
$n$-particle systems. Let us enumerate the particles in the order of their $x$
values: $0 = x_0 < x_1 < x_2 < \cdots < x_n < x_{n+1} = 1$. The symmetric
coordinates are

$$s_i = x_i - x_{i-1}, \tag{4}$$

where $i = 1, \ldots, n+1$. Unordered $n$-particle states form in the coordinates
$s_i$ a standard simplex $\Delta_n$: $s_i \geq 0$, $\sum_{i=1}^{n+1} s_i = 1$. The
configuration volume transforms into a uniform distribution in this simplex with
constant density $n!$.

^2 Positions of some particles can coincide, and, rigorously, a "set" of $n$
particles forms an unordered tuple. An unordered tuple of length $n$ of the set
$P$ is an unordered selection with possible repetitions from $P$, and is
represented by a sorted list of length $n$. In the one-dimensional case it is
convenient to sort positions (numbers) in ascending order.
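
The map from particle positions to symmetric coordinates (4) can be sketched in a few lines of Python. The function name `symmetric_coordinates` is an illustrative choice, not notation from the paper.

```python
from typing import List, Sequence


def symmetric_coordinates(positions: Sequence[float]) -> List[float]:
    """Map n positions in [0, 1] to the n + 1 spacings s_i = x_i - x_{i-1},
    with x_0 = 0 and x_{n+1} = 1, after sorting the positions."""
    if any(x < 0.0 or x > 1.0 for x in positions):
        raise ValueError("all positions must lie in [0, 1]")
    xs = [0.0] + sorted(positions) + [1.0]
    return [xs[i] - xs[i - 1] for i in range(1, len(xs))]


if __name__ == "__main__":
    s = symmetric_coordinates([0.7, 0.2, 0.5])
    print(s, sum(s))  # four non-negative spacings that sum to 1
    # Any permutation of the same particles gives the same point of the simplex.
    print(symmetric_coordinates([0.5, 0.7, 0.2]) == s)
```

Because the positions are sorted first, the result depends only on the unordered tuple of particles, which is the permutation invariance used throughout this section.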



Of course, it is possible to study the space of permutation orbits as a
quotient space endowed with the quotient metric. For the Euclidean metric in the
one-particle space, the quotient metric is

$$d_Q(\{x_1, x_2, \ldots, x_n\}, \{y_1, y_2, \ldots, y_n\}) = \min_{\sigma} \left( \sum_{i=1}^{n} \left\| x_i - y_{\sigma(i)} \right\|^2 \right)^{1/2}, \tag{5}$$

where the minimum is calculated over the set of all $n$-particle permutations
$\sigma$. Nevertheless, the use of symmetric coordinates is more transparent.
There are several other symmetric representations of $n$-particle systems: the
measure representation and the functional (distance) representation. They are
discussed below.
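
For small $n$ the quotient metric (5) can be evaluated by brute force over all permutations. The Python sketch below does this for a one-dimensional one-particle space and compares it with a shortcut that is a known fact (the rearrangement inequality) rather than a statement of the paper: on a line the minimum is attained by matching the two configurations in sorted order. Function names are hypothetical.

```python
import itertools
import math
from typing import Sequence


def quotient_distance_bruteforce(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Quotient metric (5) on a line: minimize over all permutations of ys."""
    if len(xs) != len(ys):
        raise ValueError("both configurations must have the same number of particles")
    best = min(
        sum((x - ys[j]) ** 2 for x, j in zip(xs, perm))
        for perm in itertools.permutations(range(len(ys)))
    )
    return math.sqrt(best)


def quotient_distance_sorted(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Same distance on a line, using the sorted matching (one-dimensional case only)."""
    if len(xs) != len(ys):
        raise ValueError("both configurations must have the same number of particles")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(sorted(xs), sorted(ys))))


if __name__ == "__main__":
    xs = [0.9, 0.1, 0.4]
    ys = [0.2, 0.8, 0.5]
    print(quotient_distance_bruteforce(xs, ys), quotient_distance_sorted(xs, ys))
```

The brute-force version costs $n!$ evaluations and is only meant as a check; the sorted matching does not carry over to one-particle spaces of higher dimension.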



2.2 Distance-measure relations in large-dimensional simplex 



Let us consider an $n$-dimensional standard simplex $\Delta_n$. The normalized
equidistribution in $\Delta_n$ has the constant density $n!$. We call the
corresponding probability measure the normalized volume, and use the notation
$\mathrm{Vol}(\cdot)$: $\mathrm{Vol}(\Delta_n) = 1$. When discussing probability,
we identify the probability of an event $P\{\cdot\}$ with the volume of the
corresponding set $\mathrm{Vol}(\cdot)$.

For large $n$, almost all the volume of the simplex $\Delta_n$ is concentrated in
a small neighborhood of the center of $\Delta_n$, near the point
$c = (\frac{1}{n+1}, \frac{1}{n+1}, \ldots, \frac{1}{n+1})$. The Euclidean radius
$R$ of this neighborhood can be chosen of order $\sim n^{-1/2}$. A projection of
an $n$-dimensional Euclidean ball with unit radius on a line is concentrated in
an interval of length $\sim n^{-1/2}$. This is the Maxwell (waist) concentration.
Hence, any projection of an $n$-dimensional standard simplex on a line is
concentrated within an interval of length $\sim n^{-1}$. This is true not only
for orthogonal projections, but for any Lipschitz functions with Lipschitz
constant 1 (1-Lipschitz functions), as for the Lévy concentration. (See [7], p.
235.)

In order to demonstrate the main concentration properties of a simplex, let us
start with the evaluation of moments. The moments give the estimate of the
concentration radius in a simplex, but only power estimates of deviations are
achievable in this way. We use Chebyshev's inequality for a positive random
variable $\xi$: $P\{\xi \geq a\} \leq E(\xi)/a$, where $E(\xi)$ is the expectation
of $\xi$ (the average).

The distribution density for the value $s$ of one coordinate $s_i$ in the
$n$-dimensional standard simplex is $\rho_1(s) = n(1 - s)^{n-1}$; the joint
density function for two coordinates $s_1, s_2$ is
$\rho_2(s_1, s_2) = n(n-1)(1 - s_1 - s_2)^{n-2}$; for $k$ coordinates
$s_1, s_2, \ldots, s_k$ ($k < n$) the joint density is

$$\rho_k(s_1, s_2, \ldots, s_k) = \frac{n!}{(n-k)!} \left( 1 - \sum_{i=1}^{k} s_i \right)^{n-k}.$$

The first moments are: $E(s) = 1/(n+1) = 1/n + o(1/n)$,
$E(s^2) = 2/[(n+1)(n+2)] = 2/n^2 + o(1/n^2)$,

$$\mathrm{Var}(s) = E(s^2) - (E(s))^2 = \frac{n}{(n+1)^2(n+2)} = \frac{1}{n^2} + o\left(\frac{1}{n^2}\right),$$

and for $k < n$,

$$E(s^k) = \frac{n!\,k!}{(n+k)!} = \frac{k!}{n^k} + o\left(\frac{1}{n^k}\right)$$

(the last equality holds for any given $k$ and $n \to \infty$). For the first mixed
moments we get $E(s_1 s_2) = 1/[(n+1)(n+2)] = 1/n^2 + o(1/n^2)$,

$$\mathrm{Cov}(s_1, s_2) = E(s_1 s_2) - E(s_1)E(s_2) = -\frac{1}{(n+1)^2(n+2)} = -\frac{1}{n^3} + o\left(\frac{1}{n^3}\right),$$

and for the correlation coefficient

$$\mathrm{Cor}(s_1, s_2) = \frac{\mathrm{Cov}(s_1, s_2)}{\sqrt{\mathrm{Var}(s_1)\mathrm{Var}(s_2)}} = -\frac{1}{n}.$$

It is worth mentioning that $\mathrm{Cov}(s_1, s_2)$ has order $n^{-3}$ and
$\mathrm{Var}(s)$ has order $n^{-2}$; hence, correlations between coordinates
decrease as $n^{-1}$ (coordinates become independent for large $n$, and the
decrease of correlations is a symptom of this independence). It is easy to
calculate moments of the square of the Euclidean radius
$R^2 = \sum_{i=1}^{n+1} (s_i - E(s_i))^2$, for example

$$E(R^2) = (n+1)\mathrm{Var}(s) = \frac{n}{(n+1)(n+2)} = \frac{1}{n} + o\left(\frac{1}{n}\right),$$

and Chebyshev's inequality gives the simplest estimate:

$$\mathrm{Vol}\{x \in \Delta_n : R^2 > r^2\} \leq \frac{1}{r^2 n}$$

up to the leading order in $n$.
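
The closed-form moments can be checked with exact rational arithmetic. The following Python sketch relies on the standard formula $E(s^k) = n!\,k!/(n+k)!$ for one coordinate of the uniform distribution on $\Delta_n$; the function names are illustrative and the code is a consistency check, not part of the original derivation.

```python
from fractions import Fraction
from math import factorial


def moment(n: int, k: int) -> Fraction:
    """E(s^k) for one coordinate of the uniform distribution on Delta_n."""
    return Fraction(factorial(n) * factorial(k), factorial(n + k))


def mixed_moment(n: int) -> Fraction:
    """E(s_1 s_2) for two different coordinates."""
    return Fraction(1, (n + 1) * (n + 2))


def variance(n: int) -> Fraction:
    return moment(n, 2) - moment(n, 1) ** 2


def covariance(n: int) -> Fraction:
    return mixed_moment(n) - moment(n, 1) ** 2


def correlation(n: int) -> Fraction:
    # Both coordinates have the same variance, so sqrt(Var * Var) = Var.
    return covariance(n) / variance(n)


def mean_square_radius(n: int) -> Fraction:
    """E(R^2) = (n + 1) Var(s)."""
    return (n + 1) * variance(n)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, variance(n), correlation(n), mean_square_radius(n))
```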

With the higher moments of $R^2$ we can obtain estimates with higher powers of
$1/n$, but already a simple geometrical consideration gives exponential
estimates. For any $i = 1, \ldots, n+1$, the part of $\Delta_n$ where
$s_i > \varepsilon$ has the normalized volume

$$\mathrm{Vol}\{s \in \Delta_n : s_i > \varepsilon\} = (1 - \varepsilon)^n \approx \exp(-\varepsilon n). \tag{7}$$

Hence, the set $K_\varepsilon \subset \Delta_n$, where $s_i \leq \varepsilon$ for
all $i = 1, \ldots, n+1$, has the normalized volume

$$V_\varepsilon \geq 1 - (n+1)(1 - \varepsilon)^n \approx 1 - n\exp(-\varepsilon n). \tag{8}$$

For any point $x = (s_1, \ldots, s_{n+1}) \in K_\varepsilon$ the following
inequality holds: $R^2 \leq \sum_{i=1}^{n+1} s_i^2 \leq \varepsilon
\sum_{i=1}^{n+1} s_i = \varepsilon$. Therefore, the intersection of $\Delta_n$
and a Euclidean ball $B_\rho^{(n+1)}$ with the center $c$ includes the set
$K_\varepsilon$ if $\varepsilon \leq \rho^2$. Hence, for the normalized volume of
this intersection, $W_\rho$, the following inequalities hold:

$$W_\rho = \mathrm{Vol}\{x \in \Delta_n : R^2 \leq \rho^2\} \geq V_{\rho^2} \geq 1 - (n+1)(1 - \rho^2)^n \approx 1 - n\exp(-\rho^2 n). \tag{9}$$

The estimate (9) implies that for any given positive constant $a < 1$ there exists
a positive constant $b$ such that $W_{b \ln n/\sqrt{n}} > a$ for all $n$. In other
words, for any given share $a$ of the simplex volume there exists a constant
$b > 0$ such that the Euclidean ball $B^{(n+1)}_{b \ln n/\sqrt{n}}$ with the
center $c$ includes this part of the volume for all $n$. We can guarantee with (9)
that the radius of such a ball goes to zero as $\ln n/\sqrt{n}$.

A precise analysis of the concentration effects in balls and in a standard
simplex was performed in [7,16].
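
The bounds (7) and (9) can be compared with a direct simulation. The Python sketch below is an illustration under stated assumptions: uniform points of $\Delta_n$ are generated as normalized independent exponential variables (a standard sampling technique, not taken from the paper), and all function names are hypothetical.

```python
import math
import random
from typing import List


def wing_volume(n: int, eps: float) -> float:
    """Normalized volume of the part of Delta_n with s_i > eps, Eq. (7)."""
    return (1.0 - eps) ** n


def ball_volume_lower_bound(n: int, rho: float) -> float:
    """Lower bound for W_rho from Eq. (9): 1 - (n + 1)(1 - rho^2)^n."""
    return 1.0 - (n + 1) * (1.0 - rho * rho) ** n


def sample_simplex(n: int, rng: random.Random) -> List[float]:
    """Uniform point of Delta_n: n + 1 normalized i.i.d. exponential variables."""
    e = [rng.expovariate(1.0) for _ in range(n + 1)]
    total = sum(e)
    return [v / total for v in e]


def ball_volume_monte_carlo(n: int, rho: float, samples: int, seed: int = 0) -> float:
    """Monte Carlo estimate of W_rho, the share of Delta_n within distance rho of the center."""
    rng = random.Random(seed)
    c = 1.0 / (n + 1)
    hits = 0
    for _ in range(samples):
        s = sample_simplex(n, rng)
        if sum((v - c) ** 2 for v in s) <= rho * rho:
            hits += 1
    return hits / samples


if __name__ == "__main__":
    n, rho = 200, 0.3
    print(wing_volume(n, 0.05), math.exp(-0.05 * n))
    print(ball_volume_lower_bound(n, rho), ball_volume_monte_carlo(n, rho, 2000))
```

In such runs the simulated share is typically much closer to 1 than the bound requires, which is consistent with the moment estimate $E(R^2) \approx 1/n$.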

The concentration of the simplex measure in a small vicinity of its center can be
considered as an effect that is opposite to the Gibbs concentration of the volume
of an $n$-dimensional ball $B^n$ in a small vicinity of its boundary, the sphere.
On the other hand, it is similar to the waist concentration. And now not
only the values of macroscopic projections can be separated into two sets (one
with a microscopically small diameter, the other with a microscopically
small measure), but also the set of the symmetrized microscopic states. The
symmetrization with respect to permutations of particles plays the same role as
the macroscopic projection. We can say now that this symmetrization is the
main step in the micro-macro transformation.
2.3 Symmetric coordinates for multidimensional phase space



In order to demonstrate the same effect for a one-particle configuration space (or
phase space) of non-unit dimension, let us consider a product of $m$ simplices
$\Delta_n^m$ for $m \sim n^\alpha$ and some power $\alpha$. The Euclidean diameter
of $\Delta_n^m$ grows with $m$ as $\sqrt{m} \sim n^{\alpha/2}$; the Euclidean
diameter of the product of Euclidean balls $(B_\rho^{(n+1)})^m$ is
$2\sqrt{m}\rho \sim \rho n^{\alpha/2}$. For the normalized volume of the
intersection $\Delta_n^m \cap B^{m(n+1)}_{\sqrt{m}\rho}$ the following estimate
holds:

$$\mathrm{Vol}\left(\Delta_n^m \cap B^{m(n+1)}_{\sqrt{m}\rho}\right) \geq \mathrm{Vol}\left(\Delta_n^m \cap (B_\rho^{(n+1)})^m\right) \geq (1 - n\exp(-\rho^2 n))^m \approx 1 - n^{1+\alpha}\exp(-\rho^2 n). \tag{10}$$

From this estimate it follows that the strong order-disorder separation holds
for these Cartesian powers of a simplex also (if $m \sim n^\alpha$): almost all
the volume belongs to a Euclidean ball with the relatively small diameter
$R \sim \rho n^{\alpha/2}$. In order to include in this ball any given share of
the volume we can choose $\rho \sim n^{-1/2}$ with an appropriate value of the
prefactor. Therefore, the corresponding ratio of diameters
$R/\mathrm{Diam}(\Delta_n^m)$ goes to zero as $n^{-1/2}$.

Let the one-particle space be the $k$-dimensional unit cube $Q_k$. The space for
the $n$-particle system is $(Q_k)^n$. We produce the symmetric map of $(Q_k)^n$
onto a product of $n^{1/k}$-dimensional simplices as follows. Let $\xi_i$,
$i = 1, \ldots, k$, $0 \leq \xi_i \leq 1$, be coordinates in $Q_k$. With each
coordinate axis we construct a projection of $(Q_k)^n$ onto
$(\Delta_{n^{1/k}})^{n^{(k-1)/k}}$. The product of $k$ such projections is the
resulting map $(Q_k)^n \to (\Delta_{n^{1/k}})^{k n^{(k-1)/k}}$.

For $\xi_k$ this projection is the top floor of the "staged tower" of symmetric
coordinates. Let us first enumerate the particles in the order of their $\xi_1$
values: $0 = x_0 < x_1 < x_2 < \cdots < x_n < x_{n+1} = 1$. The first set (the
ground floor of the "staged tower") of symmetric coordinates is
$s_i = x_i - x_{i-1}$, where $i = 1, \ldots, n+1$.

Let us divide the particles into $n^{1/k}$ groups $G^l$, $l = 1, \ldots, n^{1/k}$,
with $n^{(k-1)/k}$ elements in each group, in the same order: the first
$n^{(k-1)/k}$ particles with coordinates $\xi_1 = x_1, x_2, \ldots,
x_{n^{(k-1)/k}}$ belong to the first group, $G^1$, then follow $n^{(k-1)/k}$
particles of the second group, etc. Let us enumerate the particles of each group
in the order of their $\xi_2$ values: $0 = x_0^l < x_1^l < x_2^l < \cdots <
x^l_{n^{(k-1)/k}} < x^l_{n^{(k-1)/k}+1} = 1$, where the superscript $l$ is the
group number.^3

The first floor of the "staged tower" consists of $n^{1/k}$ sets of symmetric
coordinates $s_i^l = x_i^l - x_{i-1}^l$. After that, we can divide each $G^l$
into $n^{1/k}$ groups $G^{lm}$, $m = 1, \ldots, n^{1/k}$, with $n^{(k-2)/k}$
elements in each group, in the order of their $\xi_2$ values. Let us enumerate the
particles of each group in the order of their $\xi_3$ values:
$0 = x_0^{lm} < x_1^{lm} < x_2^{lm} < \cdots < x^{lm}_{n^{(k-2)/k}} <
x^{lm}_{n^{(k-2)/k}+1} = 1$. The second floor consists of $n^{2/k}$ sets of
symmetric coordinates $s_i^{lm} = x_i^{lm} - x_{i-1}^{lm}$. Finally, we get $k$
floors (from the ground one to the $(k-1)$st one). The floor number $j$
($j = 0, \ldots, k-1$) consists of $n^{j/k}$ groups of symmetric coordinates with
$n^{(k-j)/k}$ coordinates in each group. These coordinates are non-negative, and
their sums in groups are equal to 1. Therefore, each floor represents an
$n$-dimensional polyhedron that is a product of $n^{j/k}$ standard simplices
$\Delta_{n^{(k-j)/k}}$ of dimension $n^{(k-j)/k}$, and the whole tower represents
the following product of simplices:

$$\Omega_{k,n} = \prod_{j=0}^{k-1} \Omega_j, \quad \text{where } \Omega_j = \left(\Delta_{n^{(k-j)/k}}\right)^{n^{j/k}}. \tag{11}$$

We are interested in the $(k-1)$st floor that corresponds to $\xi_k$. It is
$\Omega_{k-1} = (\Delta_{n^{1/k}})^{n^{(k-1)/k}}$. Analogous projections for the
other $\xi_i$ can be obtained by a (cyclic) permutation of coordinates.

^3 Of course, it is more rigorous to speak about integer parts of numbers: $G^1$
consists of IntegerPart($n^{(k-1)/k}$) elements, $G^2$ consists of
IntegerPart($2n^{(k-1)/k}$) $-$ IntegerPart($n^{(k-1)/k}$) elements, and so on,
but it adds nothing to the sense, only the notations become cumbersome.
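
The construction of the "staged tower" may be easier to follow on the smallest nontrivial case. The Python sketch below treats only $k = 2$ (points in the unit square) and makes simplifying assumptions that are not in the paper: the number of groups is passed explicitly instead of being computed as $n^{1/2}$, and $n$ is required to be divisible by it. The names `spacings` and `staged_tower_2d` are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def spacings(values: Sequence[float]) -> List[float]:
    """Symmetric coordinates of numbers in [0, 1]: gaps between sorted values,
    with the boundary values 0 and 1 added."""
    xs = [0.0] + sorted(values) + [1.0]
    return [xs[i] - xs[i - 1] for i in range(1, len(xs))]


def staged_tower_2d(
    points: Sequence[Tuple[float, float]], groups: int
) -> Tuple[List[float], List[List[float]]]:
    """Ground floor and first floor of the tower for k = 2.

    Ground floor: spacings of the first coordinate over all particles.
    First floor: particles are split into `groups` consecutive groups in the
    order of the first coordinate; inside each group the spacings of the
    second coordinate are taken.
    """
    n = len(points)
    if groups <= 0 or n % groups != 0:
        raise ValueError("the number of particles must be divisible by the number of groups")
    ordered = sorted(points, key=lambda p: p[0])
    ground = spacings([p[0] for p in ordered])
    size = n // groups
    first = [
        spacings([p[1] for p in ordered[g * size:(g + 1) * size]])
        for g in range(groups)
    ]
    return ground, first


if __name__ == "__main__":
    rng = random.Random(0)
    pts = [(rng.random(), rng.random()) for _ in range(16)]
    ground, first = staged_tower_2d(pts, 4)
    print(len(ground), [len(f) for f in first])
    print([round(sum(f), 12) for f in first])  # each group is a point of a simplex
```

Each inner list is a point of a small standard simplex, so the first floor as a whole is a point of a product of simplices, in the spirit of (11).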

We see that for a $k$-dimensional one-particle space the result is qualitatively
the same as for a one-dimensional one. The only difference is that here the
estimates guarantee that the relative Euclidean radius (that is, the ratio of the
radius to the diameter of the whole space) of the set where an arbitrary part
$a < 1$ of the measure is concentrated tends to zero as $1/\sqrt{d}$ instead of
$1/\sqrt{n}$. Here $d$ is the dimension of one simplex from the product, that is,
$d = n^{1/k}$, and the relative radius goes to zero as $n^{-1/(2k)}$.

This change of order reflects a simple fact: the typical distance from a particle
to the nearest particles in dimension $k$ is $\sim n^{-1/k}$. After summation of
$d$ squares of such variables we get the square of the radius: $\sim n^{-1/k}$.
Then we take the $n^{(k-1)/k}$th power of the $d$-dimensional simplex, and of the
ball from this simplex also. The ratio of the Euclidean radii does not change
after this operation. It remains $\sim n^{-1/(2k)}$.

The same results hold for a one-particle space $P$ that is not a cube, but a
bi-Lipschitz image of a cube, or can be covered by a finite number of such
images. We discuss much more general metric-measure (mm) spaces in the
next subsection.

2.4 Other natural distances on symmetrized states 

A metric space $P$ with distance $d(x, y)$ and a given measure $\mu$ on $P$ is an
mm-space [7] if every metric ball is measurable. In this section we discuss the
distribution of particles in an mm-space $P$ with a probability measure $\mu$;
hence, $\mu(P) = 1$.

We assume that $P$ is compact^4 and, hence, has a finite diameter. The space
of (Radon) measures on $P$ is $C^*(P)$, that is, the conjugate space of the space
of continuous functions $C(P)$ on $P$. The action of a measure $\nu \in C^*(P)$ on
a function $f \in C(P)$ is the number $[\nu, f]$. The action of the probability
measure $\mu$ on $f$ is the expectation: $[\mu, f] = E(f)$. The probability
measures on $P$ are positive and normalized elements of $C^*(P)$. For measures we
use the weak* convergence: $\mu_i \to \mu_0$ if $[\mu_i, f] \to [\mu_0, f]$ for
every continuous function $f$.

An unordered tuple of $n$ points ("particles") in a mm-space $P$ can be
represented as a probability measure:

\[ \{x_1, x_2, \ldots, x_n\} \mapsto \mu_{x_1,x_2,\ldots,x_n} = \frac{1}{n} \sum_{i=1}^{n} \delta_{x_i}, \tag{12} \]

where $\delta_{x_i}$ is a unit measure concentrated at the point $x_i$
($\delta$-function). The law of large numbers states that
$\mu_{x_1,x_2,\ldots,x_n} \to \mu$ (in the weak* sense) for almost all sequences
$\{x_1, x_2, \ldots, x_n, \ldots\} \in P^\infty$. "Almost all" means here: the set
of exceptions has zero measure. Let $f$ be an arbitrary bounded continuous
function on $P$. The standard law of large numbers for the random variable $f$ and
the probability space $P$ immediately gives the weak* convergence: the sequence
of averages $\langle f \rangle_n = \frac{1}{n} \sum_{i=1}^{n} f(x_i)$ converges to the
average value of $f$ with respect to the probability measure $\mu$, that is, to
the expectation $\mathbf{E}(f)$.

For each domain $W \subset P$ with non-zero measure $\mu(W)$ the probability that
all particles are outside $W$ is

\[ \mathbf{P}\{x_i \notin W,\ i = 1, \ldots, n\} = (1 - \mu(W))^n. \tag{13} \]

This estimate is analogous to the estimate of the volume of one wing of the
$n$-dimensional standard simplex (7).

Moreover, all balls $B_\rho$ of a given radius $\rho$ are nonempty with almost unit
probability. Let us take a $\rho/2$-net $\{y_1, \ldots, y_m\}$ in $P$, where
$m = \mathrm{Cap}_{\rho/2}(P)$ is the minimum number of points in a $\rho/2$-net in
$P$. Each ball $B_\rho$ in $P$ includes a ball $B_{\rho/2}(y_i)$ of radius $\rho/2$
centred at one of the points $y_1, \ldots, y_m$. Therefore, if there is a ball
$B_\rho \subset P$ free of particles, then at least one of the balls
$B_{\rho/2}(y_i)$ ($i = 1, \ldots, m$) is also free of particles. Hence,

\[ \mathbf{P}\{\text{Each ball } B_\rho \subset P \text{ includes a particle}\}
   \geq 1 - \sum_{i=1}^{\mathrm{Cap}_{\rho/2}(P)} \left(1 - \mu(B_{\rho/2}(y_i))\right)^n
   \geq 1 - \mathrm{Cap}_{\rho/2}(P) \left(1 - \mu_*(\rho/2)\right)^n, \tag{14} \]

where $\mu_*(\rho) = \inf\{\mu(B_\rho(y)) : y \in P\}$. This estimate is similar to
the estimate (8) of the joint volume of the $n$-dimensional simplex wings. In the
final estimate (14) two characteristics of the mm-space $P$ are used: the minimum
number of points in a $\rho/2$-net, $\mathrm{Cap}_{\rho/2}(P)$, and the minimal
volume of a ball of radius $\rho/2$, $\mu_*(\rho/2)$. There is also a difference
between (8) and (14): the analogue of the "number of wings" for the last
estimate, $\mathrm{Cap}_{\rho/2}(P)$, does not depend on $n$.

^ Generalization of most statements to complete, but non-compact mm-spaces (for
example, to the important case of locally compact spaces) is often possible because
the probability measure $\mu$ is concentrated on a compact subset of $P$ up to any
given accuracy, and after cutting a "tail" of the distribution $\mu$ we can return
to a compact space. The theory of large deviations and equidistribution in general
spaces is presented in Refs. [17,18].

The natural metrization of the weak* convergence on the space of probability
measures on $P$ gives the following metric [20,7]:

\[ \mathrm{Lid}(\nu, \eta) = \sup_f \left| [\nu - \eta, f] \right|, \tag{15} \]

where $f$ runs over all 1-Lipschitz functions on $P$.

Another, functional, representation of an unordered tuple of $n$ points
("particles") in a mm-space $P$ is a continuous function

\[ \{x_1, x_2, \ldots, x_n\} \mapsto f_{x_1,x_2,\ldots,x_n}: \quad
   f_{x_1,x_2,\ldots,x_n}(x) = \min_{i=1,\ldots,n} d(x, x_i). \tag{16} \]

This functional representation is an exact analogue of the simplex representation
(4): in the one-dimensional case the maximum norm of the function
$f_{x_1,x_2,\ldots,x_n}$ is $\frac{1}{2} \max_j s_j$, and the average of
$|f_{x_1,x_2,\ldots,x_n}(x)|^p$ is $\frac{1}{2^p (p+1)} \sum_j s_j^{p+1}$. In
particular, the square of the Euclidean (i.e. $L_2$) norm in a simplex is
proportional to the $L_1$ norm of the function $f_{x_1,x_2,\ldots,x_n}$.
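The one-dimensional formulas above can be checked numerically. The following is a minimal sketch, assuming that the particles live on a circle of unit length (so that there are no end gaps) and that $s_j$ are the gaps between neighbouring particles; the helper names are hypothetical and serve only this illustration.

```python
def gaps(points):
    """Gaps s_j between neighbouring particles on a circle of unit length (n >= 2)."""
    xs = sorted(p % 1.0 for p in points)
    return [(xs[(j + 1) % len(xs)] - xs[j]) % 1.0 for j in range(len(xs))]


def nearest_distance(x, points):
    """f_{x_1,...,x_n}(x): circular distance from x to the nearest particle."""
    return min(min(d, 1.0 - d) for d in (abs(x - p) % 1.0 for p in points))


def max_norm(points):
    """Maximum norm of f: half of the largest gap."""
    return 0.5 * max(gaps(points))


def moment_formula(points, p):
    """Average of |f|^p expressed through the gaps: sum_j s_j^(p+1) / (2^p (p+1))."""
    return sum(s ** (p + 1) for s in gaps(points)) / (2 ** p * (p + 1))


def moment_numeric(points, p, m=20000):
    """Midpoint-rule average of |f|^p, used only as an independent check."""
    return sum(nearest_distance((i + 0.5) / m, points) ** p for i in range(m)) / m
```

For $p = 1$ the function `moment_formula` returns one quarter of the sum of squared gaps, which is the proportionality between the $L_1$ norm of $f$ and the squared Euclidean norm in the simplex mentioned in the text.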

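For this representation the estimate (14) has a simple form:

\[ \mathbf{P}\{\|f_{x_1,x_2,\ldots,x_n}\|_{L_\infty} < \rho\}
   \geq 1 - \sum_{i=1}^{m} \left(1 - \mu(B_{\rho/2}(y_i))\right)^n
   \geq 1 - m \left(1 - \mu_*(\rho/2)\right)^n, \tag{17} \]

where $\|\cdot\|_{L_\infty}$ is the maximum norm.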

In many practically important mm-spaces $P$ the volume of balls is of order
$\rho^k$ for some power $k > 0$: $\inf \mu(B_\rho)/\rho^k = a > 0$. In that case,
for sufficiently small $\rho$ and large $n$,

\[ \left(1 - \mu_*(\rho/2)\right)^n \lesssim \exp(-a n (\rho/2)^k) \]

and

\[ \mathbf{P}\{\|f_{x_1,x_2,\ldots,x_n}\|_{L_\infty} < \rho\}
   \geq 1 - \mathrm{Cap}_{\rho/2}(P) \exp(-a n (\rho/2)^k), \tag{18} \]

where $\mathrm{Cap}_{\rho/2}(P)$ is the minimum number of points in a $\rho/2$-net
in $P$.
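To get a feeling for the orders of magnitude in (14) and (18), the right-hand side of (14) can be evaluated for the simplest mm-space. The sketch below assumes that $P$ is the interval $[0, 1]$ with the uniform measure; then a $\rho/2$-net needs at most $\lceil 1/\rho \rceil$ points and the smallest ball of radius $\rho/2$ (the one at an end point) has measure $\rho/2$. The function names are hypothetical.

```python
import math


def net_size(rho):
    """Upper bound for Cap_{rho/2}([0, 1]): centres rho/2, 3*rho/2, ... cover the interval."""
    return math.ceil(1.0 / rho)


def occupancy_bound(rho, n):
    """Right-hand side of (14) for the uniform measure on [0, 1] (assumption of this sketch)."""
    mu_min = rho / 2.0  # measure of a ball of radius rho/2 centred at an end point
    return 1.0 - net_size(rho) * (1.0 - mu_min) ** n


def radius_for_confidence(n, delta, iterations=60):
    """Bisection for a small rho such that the bound (14) is at least 1 - delta.

    The bound is non-decreasing in rho, so bisection on [1e-9, 1] is legitimate;
    it is assumed that the bound already holds at rho = 1.
    """
    lo, hi = 1e-9, 1.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if occupancy_bound(mid, n) >= 1.0 - delta:
            hi = mid
        else:
            lo = mid
    return hi
```

Calling `radius_for_confidence(n, delta)` for growing `n` shows how quickly the radius, for which every ball contains a particle with probability at least $1 - \delta$, shrinks with the number of particles.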

For the $L_1$ norm of $f_{x_1,x_2,\ldots,x_n}$ (the analogue of the square of the
Euclidean norm in a simplex of symmetrical coordinates) we obtain the estimates

\[ \|f_{x_1,x_2,\ldots,x_n}\|_{L_1} = \mathbf{E}(f_{x_1,x_2,\ldots,x_n})
   \leq \max_x |f_{x_1,x_2,\ldots,x_n}(x)| \]

and

\[ \mathbf{P}\{\|f_{x_1,x_2,\ldots,x_n}\|_{L_1} < \rho\}
   \geq \mathbf{P}\{\|f_{x_1,x_2,\ldots,x_n}\|_{L_\infty} < \rho\}
   \geq 1 - \mathrm{Cap}_{\rho/2}(P) \exp(-a n (\rho/2)^k). \tag{19} \]

And again we observe the simplex-type strong order-disorder separation. In
the maximum norm, the whole set of functions that represent $n$-point tuples
has diameter $\mathrm{Diam}(P)$. The measure is concentrated in a ball of radius
$\sim n^{-1/k}$ (18) (in the maximum norm also).

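There exists a simple connection between the measure (12) and functional (16)
representations of $n$-particle systems. Let the radius of a ball with the centre
$x$ and volume $\delta$ be a continuous function $r_\delta(x)$ for any $\delta > 0$.
For each continuous function $f(x)$ the $L_1$ function $f^\delta(x)$ is defined:

\[ f^\delta(x) = \begin{cases}
   \dfrac{1}{\delta}, & \text{if } f(x) < r_\delta(x); \\[4pt]
   0, & \text{if } f(x) \geq r_\delta(x).
   \end{cases} \tag{20} \]

The distribution $\frac{1}{n} f^\delta_{x_1,x_2,\ldots,x_n} \mu$ approximates the
measure $\mu_{x_1,x_2,\ldots,x_n}$ (12) when $\delta \to 0$.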



2.5 Statistics of local structures



In this paper, we discuss the statistics of large sets of particles with
permutation invariance. For spaces of $k$-particle sets, two embeddings are
considered: into the space of measures (12) and into spaces of functions (16).
These embeddings are useful for theoretical purposes, but for practical needs
embeddings into finite-dimensional Euclidean space are necessary, as well as
systems of internal coordinate charts on spaces of $k$-particle sets with one
labeled point. In this subsection, we discuss the embedding and the choice of
coordinates and give a non-technical introduction to statistical analysis on
non-Euclidean metric spaces.

Molecular dynamics gives us many examples of particle configurations. We
never had such detailed information before, and the question is how to process
it with maximally useful output. The classical approach of statistical physics
is based on $k$-particle distribution functions for small $k$. It is not
sufficient, for example, for the following problems.

Let the configuration of $n$ particles be given: we know all the positions of
the molecules. For each particle (point $x$) and any $k < n$ we define a
$k$-particle local configuration, or germ ($k$-germ) of the configuration at $x$,
that is, the set of $k$ particles nearest to $x$ represented in the reference
system with origin $x$. The set of $k$-germs for all possible central particles
forms a cloud of points in the space of $k$-germs. Are there clusters or clots in
this cloud? Is the distribution of $k$-germs in physical space ($\mathbb{R}^3$)
homogeneous? If it is heterogeneous, then how can we find boundaries between
locally homogeneous clusters?
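The definition of a $k$-germ translates directly into a short routine. The following is a minimal sketch, assuming that a configuration is a list of coordinate tuples, that the central particle itself is not counted among its $k$ nearest particles, and that no periodic boundary conditions are used; the function names are hypothetical.

```python
import math


def k_germ(points, center_index, k):
    """k nearest particles of points[center_index], shifted so that the centre is the origin.

    The result is sorted by distance to the central particle.
    """
    x = points[center_index]
    others = [p for i, p in enumerate(points) if i != center_index]
    others.sort(key=lambda p: math.dist(p, x))
    return [tuple(pc - xc for pc, xc in zip(p, x)) for p in others[:k]]


def all_germs(points, k):
    """The cloud of k-germs: one germ for every possible central particle."""
    return [k_germ(points, i, k) for i in range(len(points))]
```

The list returned by `all_germs` is the raw material for the cluster analysis discussed in the text; a brute-force neighbour search is used here only for brevity.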

For systems in isotropic conditions, instead of $k$-germs it is necessary to
consider orbits of $k$-germs under the action of the rotation group. Molecules
with internal structure can also be considered without principal problems (but
with some technical complications).

The problem of local heterogeneities in water is most attractive [21,22,23].
But even for hard-sphere systems the localization of cluster boundaries is not
trivial.

It is not obvious how many particles of a local configuration we should take
into account, that is, where the heterogeneities are hidden. It is necessary to
study the statistics of $k$-germs for different $k$ and to evaluate how
informative the transition from $k$ to $k + 1$ is.

The classical statistical geometry gives some tools for quantitative analysis of
configuration structure. Important sources of ideas and methods for the local
configuration analysis are the theory of random packing [24], the molecular
geometry of liquids [25] and the theory of the liquid-glass transition [26]. The
main tool of statistical geometry that is in wide use for molecular dynamics data
mining is the analysis of the statistics of the Voronoi polyhedra and the
Delaunay simplices [27,28,29]. The Voronoi polyhedron is the domain around a
particle such that all points of this domain are closer to this particle than to
any other. A group of four particles whose Voronoi polyhedra meet at one vertex
forms another basic object of statistical geometry, the Delaunay simplex.
Statistics of the Voronoi polyhedra and the Delaunay simplices gives us
information about local order in the nearest vicinity of particles: the Voronoi
polyhedron describes the coordination of the nearest atomic environment, while
the Delaunay simplex describes the shape of the cavities between the nearest
atoms.

In order to extend this vicinity to an arbitrary number of neighbors and
coordination spheres, we need the statistics of $k$-germs. We propose a systematic
study of the statistics of $k$-germs: (nonlinear) principal component analysis,
cluster analysis, and analysis of the fields of the obtained statistical
characteristics in the physical space-time. Many old questions could be revisited
in this way, especially the problems of local heterogeneity.

The crucial question is the choice of a space where the statistical analysis will 
be performed. 

For statistical computations, the embedding of the space of germs into
Euclidean space is convenient. For any sequence of functions in $\mathbb{R}^3$,
$\mathcal{F} = \{f_1, \ldots, f_m\}$, let us define



These coordinates serve for the computation of the distance $\rho$ between germs
(just a standard Euclidean distance in these coordinates can be chosen; the second
choice is the locally Euclidean Riemannian metric, the geodesic distance).

For systems with rotational symmetry, it is necessary to study the statistics of
rotational orbits of germs. The space of functions spanned by
$\{f_1, \ldots, f_m\}$ should be rotationally invariant and represented as a sum
of irreducible subspaces. In this case, the coordinate tuple for a germ (21) is a
direct sum of irreducible tensors, and it is easy to write the complete system of
rotational invariants and to define an invariant distance on the space of germs.

For this purpose, it is convenient to choose $\{f_1, \ldots, f_m\}$ as
eigenfunctions of a Schrödinger operator with a central force, sorted by
eigenvalues and momentum (isotropic oscillator eigenfunctions, for example).
These eigenfunctions are spherical harmonics multiplied by radial functions
$f_j(\rho)$ that decay when $\rho \to \infty$; hence, a far particle has less
influence on the distance between germs than the nearest one.

Statistics in Euclidean spaces with coordinates (21) is not the statistics of
germs: the average of germs for this Euclidean statistics is, in general, no
longer a germ. Let us consider the space of germs as a non-Euclidean metric space
with metric $\rho$.

Following Fréchet [31], we can define an average point $\langle z \rangle$ for a
finite subset $\{z_1, \ldots, z_q\}$ of a metric space $\mathcal{K}$ as a minimizer
of the average squared distance:

\[ \langle z \rangle = \arg\min_{z \in \mathcal{K}} \left\{ \frac{1}{q} \sum_{j=1}^{q} \rho^2(z, z_j) \right\}. \tag{22} \]



On the basis of this approach, statistics on Riemannian spaces is developed,
from simple averaging to the calculation of moments and the definition of the
normal distribution [32]. For shape statistics, the method of principal geodesic
analysis is proposed, which is a generalization of principal component analysis
to the manifold setting [33].
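A toy example shows why the Fréchet average (22) differs from the Euclidean average of coordinates. The sketch below assumes the simplest non-Euclidean metric space, a circle with angles measured in degrees and the geodesic distance, and it restricts the minimization in (22) to a finite grid of candidate points; both choices are assumptions made for illustration, not part of the method in the text.

```python
def circle_distance(a, b):
    """Geodesic distance (in degrees) between two angles on a circle."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def frechet_mean(points, candidates, distance):
    """Candidate that minimizes the average squared distance to the sample, as in (22)."""
    def cost(z):
        return sum(distance(z, p) ** 2 for p in points) / len(points)
    return min(candidates, key=cost)


# Two directions close to 0 degrees, on different sides of the cut of the chart.
sample = [350.0, 10.0]
grid = [0.1 * i for i in range(3600)]           # candidate points on the circle
mean_angle = frechet_mean(sample, grid, circle_distance)
naive_average = sum(sample) / len(sample)       # arithmetic mean of the coordinates
```

Here `mean_angle` stays near the two sample directions, while `naive_average` points to the opposite side of the circle: averaging in coordinates does not respect the metric of the space.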

We can interpret the Fréchet averaging (22) as minimization of the elastic energy
of springs that connect data points with an average point. The statistical
analysis on metric spaces may be represented as minimization of "elastic energy"
[36,37,39]. This energetic metaphor works successfully for model reduction
problems, cluster analysis and analysis of data with complex topology. Let
us give a sketch of this approach following [40]. For simplicity, we consider a
metric space embedded into Euclidean space with the Euclidean distance between
points.

Let $G$ be a simple undirected graph with a set of vertices $Y$ and a set of edges
$E$. For $k \geq 2$ a $k$-star in $G$ is a subgraph with $k + 1$ vertices
$y_{0,1,\ldots,k} \in Y$ and $k$ edges $\{(y_0, y_i) \mid i = 1, \ldots, k\} \subset E$.
Suppose that for each $k \geq 2$ a family $S_k$ of $k$-stars in $G$ has been
selected. We call a graph $G$ with selected families of $k$-stars $S_k$ an elastic
graph if, for all $E^{(i)} \in E$ and $S_k^{(j)} \in S_k$, the corresponding
elasticity moduli $\lambda_i > 0$ and $\mu_{kj} > 0$ are defined. Let $E^{(i)}(0)$,
$E^{(i)}(1)$ be the vertices of an edge $E^{(i)}$ and $S_k^{(j)}(0), \ldots,
S_k^{(j)}(k)$ be the vertices of a $k$-star $S_k^{(j)}$ (among them, $S_k^{(j)}(0)$
is the central vertex). For any map $\varphi: Y \to \mathbb{R}^m$ the energy of the
graph is defined as

\[ U^\varphi(G) := \sum_{E^{(i)}} \lambda_i \left\| \varphi(E^{(i)}(0)) - \varphi(E^{(i)}(1)) \right\|^2
   + \sum_{S_k^{(j)}} \mu_{kj} \left\| \sum_{i=1}^{k} \varphi(S_k^{(j)}(i)) - k \varphi(S_k^{(j)}(0)) \right\|^2. \tag{23} \]

Very recently, a simple but important fact was noticed [41]: every system of
elastic finite elements could be represented by a system of springs, if we allow
some springs to have negative elasticity coefficients. The energy of a $k$-star
$s_k$ in $\mathbb{R}^m$ with $y_0$ in the centre and $k$ endpoints $y_{1,\ldots,k}$
is $u_{s_k} = \mu_{s_k} \left( \sum_{i=1}^{k} y_i - k y_0 \right)^2$, or, in the
spring representation,
$u_{s_k} = k \mu_{s_k} \sum_{i=1}^{k} (y_i - y_0)^2 - \mu_{s_k} \sum_{i>j} (y_i - y_j)^2$.
Here we have $k$ positive springs with coefficients $k \mu_{s_k}$ and $k(k-1)/2$
negative springs with coefficients $-\mu_{s_k}$.
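The equivalence of the two expressions for the star energy is a purely algebraic identity, and it is easy to test it on random data. The following sketch assumes points given as coordinate tuples; the function names are hypothetical.

```python
import random


def sq_norm(v):
    """Squared Euclidean norm of a vector."""
    return sum(c * c for c in v)


def diff(a, b):
    """Component-wise difference a - b."""
    return tuple(p - q for p, q in zip(a, b))


def star_energy(y0, ends, mu):
    """mu * || sum_i y_i - k*y_0 ||^2 for a k-star with centre y0 and endpoints ends."""
    k = len(ends)
    total = tuple(sum(y[c] for y in ends) - k * y0[c] for c in range(len(y0)))
    return mu * sq_norm(total)


def spring_energy(y0, ends, mu):
    """k positive springs (coefficient k*mu) minus k(k-1)/2 negative springs (coefficient mu)."""
    k = len(ends)
    positive = k * mu * sum(sq_norm(diff(y, y0)) for y in ends)
    negative = mu * sum(sq_norm(diff(ends[i], ends[j])) for i in range(k) for j in range(i))
    return positive - negative


rng = random.Random(0)
centre = tuple(rng.uniform(-1, 1) for _ in range(3))
endpoints = [tuple(rng.uniform(-1, 1) for _ in range(3)) for _ in range(4)]
gap = abs(star_energy(centre, endpoints, 0.7) - spring_energy(centre, endpoints, 0.7))
```

The value `gap` is zero up to rounding errors for any choice of the centre, the endpoints and the modulus.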
For a given map $\varphi: Y \to \mathbb{R}^m$ we divide the dataset $D$ into
subsets $K^y$, $y \in Y$. The set $K^y$ contains the data points for which the node
$\varphi(y)$ is the closest one in $\varphi(Y)$. The energy of approximation is:

\[ U^\varphi_A(G, D) := \sum_{y \in Y} \sum_{x \in K^y} w(x) \|x - \varphi(y)\|^2, \tag{24} \]

where $w(x) \geq 0$ are the point weights.

The simple and very popular algorithm for minimisation of the energy
$U^\varphi = U^\varphi_A(G, D) + U^\varphi(G)$ is the splitting algorithm, in the
spirit of the classical $k$-means clustering: for a given system of sets
$\{K^y \mid y \in Y\}$ we minimise $U^\varphi$, then for a given $\varphi$ we find
new $\{K^y\}$, and so on; stop when no change occurs. This is a constrained
minimisation: the nodes move along the $k$-germs space embedded into Euclidean
space, while the distance in this example is the Euclidean one. This algorithm
gives a local minimum, and the global minimisation problem arises. There are many
methods for improving the situation, but without a guarantee of global
minimisation.
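The alternation between the partition step and the optimisation step can be sketched in a few lines. The sketch below makes several simplifying assumptions that are not in the text: all weights $w(x)$ are equal to 1, the star families are empty (only edge springs with a common modulus `lam` are kept), the nodes move freely in Euclidean space, and the quadratic energy for a fixed partition is reduced by coordinate descent instead of an exact linear solve. All names are hypothetical.

```python
import math


def assign(data, nodes):
    """Partition step: K^y collects the indices of data points whose closest node is y."""
    parts = [[] for _ in nodes]
    for idx, x in enumerate(data):
        best = min(range(len(nodes)), key=lambda y: math.dist(x, nodes[y]))
        parts[best].append(idx)
    return parts


def energy(data, nodes, edges, lam):
    """Approximation energy (24) with unit weights plus the edge part of the graph energy (23)."""
    approx = sum(min(math.dist(x, p) ** 2 for p in nodes) for x in data)
    elastic = lam * sum(math.dist(nodes[a], nodes[b]) ** 2 for a, b in edges)
    return approx + elastic


def relax(data, nodes, edges, parts, lam, sweeps=50):
    """Optimisation step: coordinate descent on the quadratic energy for a fixed partition."""
    nbrs = [[] for _ in nodes]
    for a, b in edges:
        nbrs[a].append(b)
        nbrs[b].append(a)
    dim = len(nodes[0])
    for _ in range(sweeps):
        for y in range(len(nodes)):
            weight = len(parts[y]) + lam * len(nbrs[y])
            if weight == 0:
                continue  # isolated node without data: nothing to optimise
            # Exact minimiser of the energy with respect to node y, other nodes fixed.
            nodes[y] = tuple(
                (sum(data[i][c] for i in parts[y]) + lam * sum(nodes[z][c] for z in nbrs[y])) / weight
                for c in range(dim)
            )
    return nodes


def fit(data, nodes, edges, lam=0.1, max_iter=100):
    """Splitting algorithm: alternate relax/assign until the partition stops changing."""
    nodes = list(nodes)
    parts = assign(data, nodes)
    for _ in range(max_iter):
        nodes = relax(data, nodes, edges, parts, lam)
        new_parts = assign(data, nodes)
        if new_parts == parts:
            break
        parts = new_parts
    return nodes, parts
```

Both steps can only decrease the total energy, which is why the iterations stop at a local minimum; as stated in the text, nothing in this scheme guarantees the global one.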

The next problem is the elastic graph construction. Here we should find a
compromise between simplicity of the graph topology, simplicity of the geometrical
form for a given topology, and accuracy of approximation. Geometrical complexity
is measured by the graph energy $U^\varphi(G)$, and the error of approximation
is measured by the energy of approximation $U^\varphi_A(G, D)$. Both are included
in the energy $U^\varphi$. Topological complexity will be represented by means of
elementary transformations: it is the length of the energetically optimal chain of
elementary transformations from a given set applied to an initial simple graph.

Graph grammars [42,43] provide a well-developed formalism for the description
of elementary transformations. An elastic graph grammar is presented as a set
of production (or substitution) rules. Each rule has the form $A \to B$, where $A$
and $B$ are elastic graphs. When this rule is applied to an elastic graph, a
copy of $A$ is removed from the graph together with all its incident edges and
is replaced with a copy of $B$ with edges that connect $B$ to the graph. For a full
description of this language we need the notion of a labeled graph. Labels are
necessary to provide the proper connection between $B$ and the graph.

A link in the energetically optimal transformation chain is constructed by
finding the transformation application that gives the largest energy descent
(after an optimization step), then the next link, and so on, until we achieve the
desired accuracy of approximation or the limit number of transformations
(some other termination criteria are also possible).

As a simple (but already rather powerful) example we use a system of two
transformations: "add a node to a node" and "bisect an edge." These
transformations act on a class of primitive elastic graphs: all non-terminal nodes
with $k$ edges are centers of elastic $k$-stars, which form all the $k$-stars of the
graph. For a primitive elastic graph, the number of stars is equal to the number
of non-terminal nodes: the graph topology prescribes the elastic structure.

The transformation "add a node" can be applied to any vertex $y$ of $G$: add a
new node $z$ and a new edge $(y, z)$. The transformation "bisect an edge" is
applicable to any pair of graph vertices $y$, $y'$ connected by an edge $(y, y')$:
delete the edge $(y, y')$, add a vertex $z$ and two edges, $(y, z)$ and $(z, y')$.
The transformation of the elastic structure (change in the star list) is induced
by the change of topology, because the elastic graph is primitive. This
two-transformation grammar with energy minimization builds principal trees (and
principal curves, as a particular case) for datasets.
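The two transformations and the induced change of the star list can be written down explicitly. The sketch below assumes a graph stored as a list of node positions and a list of edges (pairs of node indices); the initial position of a new node (given by the caller for "add a node", the midpoint for "bisect an edge") is an assumption of this illustration, since in the method it is subsequently adjusted by the optimization step. The function names are hypothetical.

```python
def add_node(nodes, edges, y, position):
    """'Add a node to a node': attach a new node z to the existing vertex y."""
    z = len(nodes)
    return nodes + [position], edges + [(y, z)]


def bisect_edge(nodes, edges, edge):
    """'Bisect an edge': replace the edge (y, y') by (y, z) and (z, y') with a new vertex z."""
    a, b = edge
    z = len(nodes)
    midpoint = tuple((p + q) / 2 for p, q in zip(nodes[a], nodes[b]))
    kept = [e for e in edges if set(e) != set(edge)]
    return nodes + [midpoint], kept + [(a, z), (z, b)]


def stars(nodes, edges):
    """Star list of a primitive elastic graph: one star per non-terminal node."""
    nbrs = {i: [] for i in range(len(nodes))}
    for a, b in edges:
        nbrs[a].append(b)
        nbrs[b].append(a)
    return {y: sorted(ns) for y, ns in nbrs.items() if len(ns) >= 2}
```

Starting from a single edge, any sequence of these two operations keeps the graph a tree, and `stars` recomputes the elastic structure from the topology alone, as required for primitive elastic graphs.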

For applications, it is useful to associate with these principal trees
one-dimensional continuums. Such a continuum consists of the node images
$\varphi(y)$ and of pieces of lines that connect the images of linked nodes.

The first task of the statistical analysis of $k$-germs is dimension reduction.
The method of choice here is principal component analysis (PCA). Its linear
version is now classical textbook material [34], and nonlinear PCA has been
developed recently [35,36,38,39]. The methods of elastic manifolds and graphs [39]
do not require a Euclidean space of data. The second task is cluster analysis. The
described method of elastic graphs is a tool both for nonlinear PCA and for
cluster analysis.

The third task, which is specific to statistical physics, is the analysis of the
distribution of $k$-germs in physical space-time. After that, we can discuss
structural non-uniformity, the quasi-chemical representation of kinetics [5], and
many other topics. Of course, additional topological information about various
bonds between particles could be added to this metric description.

Internal coordinates on the space of germs are necessary for gradient
optimization of energy. Topologically, the space of $k$-germs near a point $x$ can
be defined as the space of permutation orbits. Let us enumerate the $k$ particles
$x_1, x_2, \ldots, x_k$ nearest to the point $x$ in the order of their distance to
$x$, $\rho_i = \|x_i - x\|$: $\rho_1 \leq \rho_2 \leq \ldots \leq \rho_k$. If all
particles are in generic positions, then any two distances are distinct. This
ordered representation $\{x_1, \ldots, x_k\}$ has discontinuity points when some
$\rho_i$ coincide.

The following internal coordinates on the space of rotational orbits of germs
give generically a representation of these orbits with discontinuity points when
some $\rho_i$ coincide.

Let us enumerate the $k$ particles $x_1, x_2, \ldots, x_k$ nearest to the point $x$
in the order of their distance to $x$, $\rho_i = \|x_i - x\|$:
$\rho_1 \leq \rho_2 \leq \ldots \leq \rho_k$. We assume that all particles are in
generic positions; hence, any two distances are distinct and three particles
cannot belong to one straight line. The distances from $x_i$ to $x$, $x_1$ and $x_2$
will be the main coordinates of the $k$-germ. These are $3k - 3$ numbers:
$\{\rho_i\}_{i=1,\ldots,k}$, $\{\rho'_i\}_{i=2,\ldots,k}$, $\{\rho''_i\}_{i=3,\ldots,k}$,
where $\rho'_i = \|x_i - x_1\|$, $\rho''_i = \|x_i - x_2\|$. An additional set of
coordinates consists of $k - 2$ signs, $\sigma_i = \pm 1$, $i = 3, \ldots, k$. The
triangle $\{x, x_1, x_2\}$ belongs to a plane $L$. This plane divides the space into
two half-spaces, $L_+$ and $L_-$. We define the subscripts (signs) of the
half-spaces by the orientation of the triangle $\{x, x_1, x_2\}$ according to the
standard "screw rule" (or the "right-hand rule"). The sign $\sigma_i = +1$ if
$x_i \in L_+$, and $\sigma_i = -1$ if $x_i \in L_-$. Generically there are no
particles on $L$. The whole set of coordinates consists of $3k - 3$ real numbers
and $k - 2$ signs. If, in addition to rotation symmetry, we assume the reflection
symmetry, then there are only $k - 3$ signs: $\sigma'_j = \sigma_3 \sigma_j$,
$j = 4, \ldots, k$. The worst violations of the continuity condition for the
proposed coordinates are jumps of the basis triangle $x, x_1, x_2$ near some
configurations. For example, for a body-centered cubic lattice there are three
non-equivalent choices of the particles $x_1, x_2$ nearest to the central particle
$x$ (in this symmetric case, $\rho_1 = \rho_2$): along the cube edge, along a face
diagonal, and along a main diagonal of the cube. Therefore, in the vicinity of this
symmetric configuration, the distance $\rho'_2 = \|x_2 - x_1\|$ cannot be a
continuous function of the $k$-germ ($k > 4$).
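The internal coordinates described above can be computed directly from the particle positions. The following sketch assumes three-dimensional positions given as tuples, generic positions of the particles, and the sign convention "plus on the side of the vector product $(x_1 - x) \times (x_2 - x)$" as the concrete form of the right-hand rule; the function names are hypothetical.

```python
import math


def sub(a, b):
    return tuple(p - q for p, q in zip(a, b))


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0])


def internal_coordinates(x, neighbours):
    """Return (rho, rho1, rho2, signs) for the germ of the point x.

    rho  : k distances to x (sorted),
    rho1 : k - 1 distances to the nearest particle x1,
    rho2 : k - 2 distances to the second nearest particle x2,
    signs: k - 2 signs telling on which side of the plane (x, x1, x2) a particle lies.
    """
    pts = sorted(neighbours, key=lambda p: math.dist(p, x))
    x1, x2 = pts[0], pts[1]
    rho = [math.dist(p, x) for p in pts]
    rho1 = [math.dist(p, x1) for p in pts[1:]]
    rho2 = [math.dist(p, x2) for p in pts[2:]]
    normal = cross(sub(x1, x), sub(x2, x))  # orientation of the triangle (x, x1, x2)
    signs = [1 if dot(normal, sub(p, x)) > 0 else -1 for p in pts[2:]]
    return rho, rho1, rho2, signs
```

The returned values do not change under rotations of the whole configuration, while a reflection flips all signs simultaneously; this is why only the products $\sigma_3 \sigma_j$ survive when the reflection symmetry is assumed.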

The statistical theory of shapes of finite sets in $\mathbb{R}^m$ was launched in
the 1970s (see the survey [44]). Statistical analysis of configuration germs is an
interdisciplinary area between statistics of shapes and statistical physics.

2.6 Dynamics in systems with strong order-disorder separation

In the previous subsections we discussed relations between measure and distance in
high-dimensional systems with permutational symmetry. But the main property of
the measure under consideration is its invariance with respect to mechanical
motion. In this subsection we consider dynamics in phase spaces with
concentration. Without such a return to dynamics the consideration of
order-disorder relations in statistical mechanics is incomplete, and we can lose
some important effects.

The strong order-disorder separation causes very special peculiarities of
dynamical systems with conservation of measure. Let $P_n$ be the $n$-particle phase
space for a system with strong order-disorder separation. Assume that for a
given $\delta > 0$ the radius $\rho(\delta, n)$ of a $(1 - \delta)$-concentration
ball $B_{\rho(\delta,n)}$ with the measure $1 - \delta$ goes to 0 at $n \to \infty$
and the diameter of $P_n$ is bounded:
$\alpha > \mathrm{Diam}\, P_n > \beta > 0$.^ Phase flow transformations form a
one-dimensional semigroup of injective maps $T_t: P_n \to P_n$; $T_t$ ($t > 0$) is
a shift over time $t$.

^ We consider compact spaces to avoid trivial technical complications that are 
needed for locally compact spaces. 
For any $t > 0$ the map $T_t$ keeps the most part of the
$(1 - \delta)$-concentration ball:

\[ \mathbf{P}\left( T_t(B_{\rho(\delta,n)}) \cap B_{\rho(\delta,n)} \right) > 1 - 2\delta, \tag{25} \]

because the measure of the complement of $B_{\rho(\delta,n)}$ in $P_n$ is less than
$\delta$.

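For time averages of bounded differentiable vector-functions $f(t)$ on
$[0, \infty[$ (with bounded derivatives) an elementary identity holds:

\[ \langle \dot{f}^2(t) \rangle = -\langle (f(t), \ddot{f}(t)) \rangle, \tag{26} \]

if all averages exist (the time average of the derivative of the bounded function
$(f, \dot{f})$ is zero); hence,

\[ \langle \dot{f}^2(t) \rangle \leq \langle f^2(t) \rangle^{1/2} \langle \ddot{f}^2(t) \rangle^{1/2}, \tag{27} \]

and

\[ \langle \ddot{f}^2(t) \rangle \geq \frac{\langle \dot{f}^2(t) \rangle^2}{\langle f^2(t) \rangle}. \]

Let us choose the origin in the center of $B_{\rho(\delta,n)}$. In this case, under
standard assumptions,

\[ \langle a^2 \rangle \gtrsim \frac{\langle v^2 \rangle^2}{\rho^2(\delta, n)}, \]

where $\langle a^2 \rangle$ is the average square of acceleration and
$\langle v^2 \rangle$ is the average square of velocity.

It means that in systems with concentration, for a given average square of
velocity $\langle v^2 \rangle$, the average square of acceleration tends to
$\infty$ with the number of particles, even if the ($n$-particle) velocity remains
normalized. Just to imagine the orders, let us assume:
$\langle v^2 \rangle \sim \frac{1}{2} n k T$, $\rho^2(\delta, n) \sim n^{-1}$. In
this case, $\langle a^2 \rangle > \mathrm{const} \times n^3$.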
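The identity (26) and the resulting lower bound for the average squared acceleration can be verified on a simple bounded motion. The sketch below assumes a scalar periodic test function $f(t) = \cos t + \frac{1}{2} \cos 3t$, chosen only for illustration, so that the time average over $[0, \infty[$ coincides with the average over one period.

```python
import math


def time_average(g, period=2.0 * math.pi, steps=20000):
    """Average of a periodic function over one period (midpoint rule)."""
    h = period / steps
    return sum(g((i + 0.5) * h) for i in range(steps)) / steps


def f(t):
    return math.cos(t) + 0.5 * math.cos(3.0 * t)


def df(t):
    """First derivative of f (the 'velocity')."""
    return -math.sin(t) - 1.5 * math.sin(3.0 * t)


def d2f(t):
    """Second derivative of f (the 'acceleration')."""
    return -math.cos(t) - 4.5 * math.cos(3.0 * t)


mean_f2 = time_average(lambda t: f(t) ** 2)
mean_df2 = time_average(lambda t: df(t) ** 2)
mean_d2f2 = time_average(lambda t: d2f(t) ** 2)
mean_f_d2f = time_average(lambda t: f(t) * d2f(t))

identity_gap = abs(mean_df2 + mean_f_d2f)        # identity (26)
lower_bound = mean_df2 ** 2 / mean_f2            # bound for the squared acceleration
```

In this example `identity_gap` vanishes up to rounding, and `mean_d2f2` exceeds `lower_bound`; squeezing the amplitude of $f$ at a fixed `mean_df2` makes `lower_bound` grow, which is the mechanism behind the acceleration-dominated dynamics.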



Dynamics of particles with elastic collisions on an interval is equivalent to
billiards in a multidimensional simplex (see elsewhere, for example [45]). Even
if the particles are transparent (if there are no physical collisions at all), the
symmetric representation of the system (by a point in the simplex) evolves
with velocity jumps. These jumps take place every time the particles
change their order on the line.
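The velocity jumps of the symmetric representation are easy to see for two transparent particles. The following sketch assumes free motion on a line and takes the sorted tuple of positions as the symmetric representation; the names and the numbers are hypothetical.

```python
def evolve(positions, velocities, t):
    """Free motion of non-interacting (transparent) particles on a line."""
    return [x + v * t for x, v in zip(positions, velocities)]


def symmetric_state(positions, velocities):
    """Sorted positions together with the velocities carried by them."""
    order = sorted(range(len(positions)), key=lambda i: positions[i])
    return [positions[i] for i in order], [velocities[i] for i in order]


x0, v = [0.2, 0.8], [1.0, -1.0]       # the particles meet at t = 0.3
_, v_before = symmetric_state(evolve(x0, v, 0.2), v)   # before the meeting
_, v_after = symmetric_state(evolve(x0, v, 0.4), v)    # after passing through each other
```

The individual velocities never change, but the velocity of the sorted representation is exchanged at the moment of the meeting, exactly as in an elastic collision of two equal particles.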
For the functional representation of moving particles (16) (for any dimension of
the one-particle space) the time derivative of $f_{x_1,x_2,\ldots,x_n}(x)$ has a
discontinuity when the particle nearest to $x$ changes its number.

In all these cases the motion in symmetric coordinates is only piecewise differ- 
entiable, and average square of acceleration does not exist at all (is infinite). 

The described acceleration-dominated dynamics makes no difference between
real physical interaction and jumps of velocities caused, for example, by the
geometry of permutation symmetry. For motion of particles on a line, the particles
can be transparent and not interact at all. In this case one particle will
pass through the other, but any change of their order on the line causes a jump of
velocities in the symmetric representation. On the other hand, particles can
interact, collide, and not change their order on the line. The result will be
the same. For instantaneous elastic collisions the difference does not exist,
but for softer potentials the picture of acceleration dominance holds also. The
system without interaction is a billiard in the standard $n$-dimensional simplex.
Interaction changes (smoothes) collisions and bends trajectories between
them.



3 Sticky faces and natural selection 

In this section we discuss general dynamical systems, not necessarily Hamiltonian
ones or systems with conservation of volume. Let a multidimensional simplex be
positively invariant with respect to the dynamics: if a motion starts in
this simplex at some time $t_0$, then it belongs to the simplex at any moment
$t > t_0$. For such a dynamical system we can guess that the motion spends
most of the time in a small vicinity of the simplex centre. It is a very natural
expectation because of the concentration of the simplex volume near its centre,
and some theorems of the form "for a typical dynamical system with a positively
invariant multidimensional simplex a typical motion spends most of the time in a
small vicinity of the simplex centre" could be proved for an appropriate
definition of typicalness. But there exists an important opposite type of dynamic
behaviour. Let us assume that the faces of the simplex are also positively
invariant. In this case, the typical picture of dynamic behaviour changes
drastically: motions tend to a small vicinity of the small-dimensional skeleton
of the simplex.

Let us first explain the sense of such "sticky faces." The standard simplex
$\Delta_n$ has a natural interpretation as the space of $n$-dimensional probability
distributions $p_1, \ldots, p_n$ defined on $n$ states. A dynamic system with
positively invariant $\Delta_n$ is a kinetic equation. The faces of $\Delta_n$ are
positively invariant if the $i$th state cannot be produced from the $j$th one for
$i \neq j$, and only the birth-death rate of the $i$th state depends on the whole
distribution $p_1, \ldots, p_n$. It is the general form of the inheritance
property, and such dynamical systems are standard objects of study in mathematical
biology after Volterra [46], Lotka, and Gause [47]; a review of some modern works
can be found in [48].

The concentration of motions for $t \to \infty$ in a small vicinity of the
small-dimensional skeleton of the simplex is exactly the phenomenon of natural
selection [49,51]. Many physical applications of this phenomenon are known
[52,53,54].
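The drift towards the low-dimensional skeleton can be observed in the simplest kinetic equation with inheritance. The sketch below assumes constant reproduction coefficients $k_i$ and the equation $\dot{p}_i = p_i (k_i - \sum_j k_j p_j)$, integrated by the explicit Euler method; this particular system and all names are assumptions chosen for illustration, not the general class discussed in the text.

```python
def replicator_step(p, k, dt):
    """One Euler step of dp_i/dt = p_i * (k_i - sum_j k_j p_j)."""
    mean_k = sum(ki * pi for ki, pi in zip(k, p))
    return [pi + dt * pi * (ki - mean_k) for ki, pi in zip(k, p)]


def evolve(p, k, dt=0.01, steps=5000):
    """Integrate the kinetic equation from the initial distribution p."""
    for _ in range(steps):
        p = replicator_step(p, k, dt)
    return p


rates = [1.0, 1.2, 1.5]
inner_start = evolve([1.0 / 3, 1.0 / 3, 1.0 / 3], rates)   # starts at the simplex centre
face_start = evolve([0.5, 0.5, 0.0], rates)                # starts on a face: p_3 = 0
```

Starting from the centre, the distribution ends near a vertex of the simplex (selection of the state with the largest $k_i$). Starting on a face, the motion never leaves it: a state that is absent is not produced, which is the "stickiness" of the faces.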

It is easy to demonstrate this phenomenon on the example of $n$ particles
moving on the interval $[0, 1]$. The effect of sticky faces implies here that if
the position of the $i$th particle is 0 or 1, then it does not move. Following
the natural hypothesis of smoothness, we can extract a multiplier $x_i$ near 0 and
$(1 - x_i)$ near 1 from the velocity of the particle at the position $x_i$. It
means that in the new coordinates $y_i = \ln x_i - \ln(1 - x_i)$ we can expect a
more or less uniform distribution of particles. But it is the equidistribution on
the whole line. It is impossible in the classical sense, but if we take it
seriously, we come to a finitely additive distribution (or to an approximation
with equidistributions on a sequence of extended intervals). In any case, the
expected number of particles at a given distance from the interval ends (or, in
$y_i$ coordinates, in a given bounded interval) should be small in comparison with
the total number of particles: almost all particles are concentrated near the
interval ends.
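The change of coordinates can be made explicit. The following short computation assumes that the velocity of the $i$th particle is written as $x_i (1 - x_i) g_i$ with a smooth factor $g_i$ (a hypothetical notation for what remains after the two multipliers are extracted):

```latex
\dot{x}_i = x_i (1 - x_i)\, g_i , \qquad
y_i = \ln x_i - \ln (1 - x_i)
\;\Longrightarrow\;
\dot{y}_i = \left( \frac{1}{x_i} + \frac{1}{1 - x_i} \right) \dot{x}_i
          = \frac{\dot{x}_i}{x_i (1 - x_i)} = g_i .
```

In the coordinates $y_i$ the degeneration of the velocity at the ends of the interval disappears, and the ends $x_i = 0$ and $x_i = 1$ are moved to $y_i = -\infty$ and $y_i = +\infty$; this is why a roughly uniform distribution in $y_i$ places almost all particles near the ends of $[0, 1]$.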

The whole effect of sticky faces in a simplex means that if some coordinates
$s_i$ are zero, then their time derivatives are also zero. For particles moving on
$[0, 1]$ it implies that they stick to each other, and for such a system we can
observe particle agglutination, in addition to particle concentration near the
interval ends.

In order to obtain exact estimates and theorems it is useful to start with an
infinite number of particles. A variant of such a theory for continuous families
of particles is developed in [49]; see the English version in [50]. The main
result remains the same: for the dynamics on a simplex with sticky boundaries,
almost all motions tend to a small vicinity of the small-dimensional skeleton
of the simplex. Estimates of the skeleton dimension and asymptotic expansions
for motions near this skeleton are also obtained.



4 Discussion 

For a large number of particles the available phase space (or configuration
space) can be divided into two parts. One part has a microscopically small
diameter (part D, disorder), the other part (part O, order) has a microscopically
small measure (volume). This is the strong order-disorder separation.

Permutation invariance is crucial for the strong order-disorder separation. 
For example, the volume of a high-dimensional cube is concentrated near its 
boundary. After symmetrization the cube transforms into a simplex, and the 
volume of the simplex is concentrated near its center. Order is in the long, but 
thin wings of the simplex, while disorder is in the small, but thick vicinity of 
its center. This effect allows many generalizations for spaces of permutation 
orbits. 

All individual configurations of $n$ distinguishable particles in $\mathbb{R}^m$
are equivalent: the measures of their $\varepsilon$-vicinities coincide and are
equal just to the volume of an $nm$-dimensional ball of radius $\varepsilon$. The
permutation symmetry forces us to replace any single configuration $x$ of $n$
particles by the set $S_n x$ of $n!$ configurations that are generated from $x$ by
permutations of particles. These finite ensembles are no longer equivalent. For a
given bounded domain $P \subset \mathbb{R}^m$ (a box) and large $n$, there exists a
configuration $x_0$ of $n$ particles ("almost equidistribution") in $P$ such that
the $\varepsilon$-fattening of $S_n x_0$, $\{S_n x_0\}_\varepsilon$, has "almost
all" the volume of the configuration space $P^n$:

\[ \mathrm{Vol}(\{S_n x_0\}_\varepsilon) / \mathrm{Vol}(P^n) > 1 - \delta \]

for $\varepsilon \sim n^{-1/m}$ and a given small number $\delta$. If such a
configuration exists, then, obviously, all points $x$ from
$\{S_n x_0\}_\varepsilon$ have the same property (with twice increased
$\varepsilon$):

\[ \mathrm{Vol}(\{S_n x\}_{2\varepsilon}) / \mathrm{Vol}(P^n) > 1 - \delta. \]

This finite ensemble $S_n x$ is not an $\varepsilon$-net in $P^n$; moreover, the
rest, $P^n \setminus \{S_n x\}_\varepsilon$, has macroscopic (non-small) diameter
and macroscopic Hausdorff distance from $S_n x$. This is the essence of the strong
order-disorder separation: disorder has microscopic diameter (but macroscopic,
almost all, measure), order has microscopic measure (but macroscopic diameter).

The disordered states from the set D are macroscopically indistinguishable,
because the distance between them is microscopically small. We can use the
notion of "observability" from control theory and say that the difference
between these states is macroscopically unobservable.

In our definition of order and disorder we use the state-based approach: a
state may be ordered or disordered. (Of course, in the definition of order we
use the $\varepsilon$-fattening (1), hence, an ensemble is present too, but the
notion of order relates to states.) The state-based point of view in the
foundations of statistical physics has become more popular very recently [55,56].

The time arrow that leads from order to disorder has the following interpre- 
tation: if a motion starts from an ordered state, then, after some time, the 
state becomes disordered, and we can be almost sure that it will remain
disordered during a time T with microscopically small inverse (the probability of
a fluctuation from disorder to order could be estimated on the basis of Eq. (25)).
Neither chaotic dynamics nor dynamical stirring has any relation to this
behaviour. Even ergodicity is not especially important: if ergodic components are
multi-particle, then the same order-disorder separation is expected on them. 
But without strong order-disorder separation it is impossible even to formu- 
late such a statement: if a motion starts from an ordered state, then, after 
some time, the state becomes disordered, and ... 
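
The statement can be illustrated with the simplest possible dynamics. The sketch below is an illustration under stated assumptions, not a result of this paper: non-interacting particles in a one-dimensional box [0, 1] with reflecting walls, Gaussian velocities, an ordered initial state with all particles at the left wall, and the normalized Euclidean distance to the permutation orbit of a regular grid as a hypothetical measure of order. Only positions are tracked; velocities are ignored in the distance.

```python
import math
import random


def fold(x):
    """Reflect a free-flight coordinate back into the box [0, 1]."""
    y = x % 2.0
    return 2.0 - y if y > 1.0 else y


def distance_to_disorder(positions):
    """Normalized l2 distance to the permutation orbit of a regular grid."""
    n = len(positions)
    grid = [(i + 0.5) / n for i in range(n)]
    return math.sqrt(
        sum((a - b) ** 2 for a, b in zip(sorted(positions), grid)) / n
    )


def trajectory(n=2000, times=(0.0, 0.5, 2.0, 10.0, 50.0), seed=1):
    """Return (time, distance) pairs for free flight from an ordered start."""
    rng = random.Random(seed)
    x0 = [0.0] * n  # ordered start: all particles at the left wall
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return [
        (t, distance_to_disorder([fold(x + vi * t) for x, vi in zip(x0, v)]))
        for t in times
    ]


if __name__ == "__main__":
    for t, d in trajectory():
        print(f"t = {t:5.1f}   distance to disordered set = {d:.4f}")
```

No chaos or interaction is present in this model; the distance becomes small only because the particles have different velocities and the configuration space is considered up to permutations.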

Dynamics of systems with strong order-disorder separation has a very special 
property: in symmetric representation the average square of acceleration is 
very high. It can be interpreted as evolution through a series of collisions even 
for non-interacting particles. This is a hint at a possible solution of an essential
open problem, the problem of indivisible events. In a macroscopically small
time, a small microscopic subsystem can go through "its whole life", from
the beginning to the limit state (or, more accurately, to the limit behaviour,
which may be not only a state but also a type of motion, etc.). The evolution
of the microscopic subsystems in a macroscopically small time Δt should be
described as an "ensemble of indivisible events". An excellent hint is given
by the Boltzmann equation with its indivisible collisions; another good hint
is given by chemical kinetics with its indivisible events of elementary reactions.
Now we understand that the solution could be found in the high acceleration for
systems with strong order-disorder separation (Subsec. 2.6), but we do not yet
know even the form of a proper answer.
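
The collision interpretation for non-interacting particles can be made concrete in one dimension. The sketch below rests on assumptions that are not taken from Subsec. 2.6: free particles on a line without walls, and the sorted list of positions as the symmetric representation. In that representation each crossing of two particles looks like a collision, because the two neighbouring sorted coordinates exchange velocities. The code only counts such events in a short time window; it does not compute the average squared acceleration itself. All names are illustrative.

```python
import random


def count_crossings(x, v, horizon):
    """Count pairs of free particles on a line that cross within `horizon`."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    crossings = 0
    for a in range(len(order)):
        i = order[a]
        for b in range(a + 1, len(order)):
            j = order[b]  # particle j starts to the right of particle i
            closing_speed = v[i] - v[j]
            if closing_speed > 0 and x[j] - x[i] <= closing_speed * horizon:
                crossings += 1
    return crossings


def crossings_per_particle(n, horizon=0.1, seed=2):
    """Average number of collision-like events per particle in the window."""
    rng = random.Random(seed)
    x = [rng.random() for _ in range(n)]
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return count_crossings(x, v, horizon) / n


if __name__ == "__main__":
    for n in (100, 200, 400):
        print(n, crossings_per_particle(n))
```

At fixed box size and fixed time window, the number of such events per particle grows with the number of particles, so the sorted trajectory acquires more and more kinks.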

The effect of strong order-disorder separation and the time arrow direction from
order to disorder are reversed if we assume invariance of the boundary
(sticky boundaries). If we consider these dynamical systems as kinetic equations,
the effect of sticky boundaries can be presented as inheritance: if some
species (or genes, as we prefer) are not present in the system now, they will
not appear in the future. In this case, the evolution from disorder to order has
a special name: Natural selection. Many applications of this effect are known 
in physics: from mode selection in lasers to wave turbulence. It is as general as 
order-disorder separation, and appears together with any sort of inheritance. 
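
A standard toy model with this inheritance property is the replicator equation, dx_i/dt = x_i (k_i - sum_j k_j x_j). It is used here only as an assumed example of kinetics with sticky boundaries and is not a model taken from this paper: a species with zero share keeps zero share forever, and the distribution concentrates on the fittest species among those initially present. The fitness values and the explicit Euler scheme are illustrative choices.

```python
def select(shares, fitness, dt=0.01, steps=5000):
    """Euler steps for dx_i/dt = x_i * (k_i - sum_j k_j x_j)."""
    x = list(shares)
    for _ in range(steps):
        mean_k = sum(k * xi for k, xi in zip(fitness, x))
        x = [xi + dt * xi * (k - mean_k) for k, xi in zip(fitness, x)]
        total = sum(x)
        x = [xi / total for xi in x]  # guard against round-off drift
    return x


if __name__ == "__main__":
    # The fittest species (k = 5) is absent at the start and never appears.
    start = [0.25, 0.25, 0.25, 0.25, 0.0]
    rates = [1.0, 2.0, 3.0, 4.0, 5.0]
    print(select(start, rates))
```

Starting from a uniform ("disordered") distribution over the present species, the system evolves towards an ordered state concentrated on a single species, while the face of the simplex on which the absent species has zero share remains invariant.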

The role of the permutation symmetry in statistical physics was discussed 
many times, from different points of view: as a basic axiom [57,58], as a prac- 
tical question related to entropy definition and measurement [59,60]; even an 
ontological status of this assumption was discussed quite thoroughly [61]. In
this paper, in addition to this discussion, we demonstrate the importance of
permutation invariance for order-disorder separation and for the direction of the
time arrow from order to disorder.

The idea of measure concentration already affects even applied computer
science [62]. The history of its physical applications started more than 100 years
ago, and now measure concentration is one of the central ideas of statistical
physics; we should only recognize this properly.

* * * 

And what about the children- and parents-rooms? Of course, the children-room
has no permutation symmetry: every toy has its own sense, and a
permutation destroys the sense of the configuration. But in the parents-room there
is perfect permutation symmetry: the toys there are just some things that
should be returned to the toy-box. Hence, the children-room is in order, but
the parents-room is in full disorder. Moreover, the children-room has no
order-disorder separation, because each configuration has its own sense: disorder
is impossible in the children-room!

Acknowledgements. I am very grateful to M. Gromov and H.C. Öttinger